Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier
Authors
Abstract
The simple Bayesian classifier (SBC) is commonly thought to assume that attributes are independent given the class, but this is apparently contradicted by the surprisingly good performance it exhibits in many domains that contain clear attribute dependences. No explanation for this has been proposed so far. In this paper we show that the SBC does not in fact assume attribute independence, and can be optimal even when this assumption is violated by a wide margin. The key to this finding lies in the distinction between classification and probability estimation: correct classification can be achieved even when the probability estimates used contain large errors. We show that the previously-assumed region of optimality of the SBC is a second-order infinitesimal fraction of the actual one. This is followed by the derivation of several necessary and several sufficient conditions for the optimality of the SBC. For example, the SBC is optimal for learning arbitrary conjunctions and disjunctions, even though they violate the independence assumption. The paper also reports empirical evidence of the SBC's competitive performance in domains containing substantial degrees of attribute dependence.
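To make the object of study concrete, the following Python sketch (not the paper's implementation; the names fit_sbc and predict_sbc are illustrative) shows the SBC in its simplest form: class priors P(C) and per-attribute conditionals P(A_i | C) are estimated by frequency counts, and an example is assigned to the class maximizing P(C) * prod_i P(A_i | C). The toy data encode a conjunction (class 1 only when both attributes are 1), which violates attribute independence given the class yet is still classified correctly, in the spirit of the paper's conjunction result.

# Minimal simple Bayesian (naive Bayes) classifier sketch for discrete attributes.
# Probabilities are plain frequency estimates; a Laplace correction could be added.
from collections import Counter, defaultdict

def fit_sbc(examples, labels):
    """Estimate class priors P(C) and per-attribute conditionals P(A_i | C)."""
    n = len(labels)
    class_counts = Counter(labels)
    # cond[(i, value, c)] = number of class-c examples whose attribute i equals value
    cond = defaultdict(int)
    for x, c in zip(examples, labels):
        for i, value in enumerate(x):
            cond[(i, value, c)] += 1
    priors = {c: k / n for c, k in class_counts.items()}
    return priors, cond, class_counts

def predict_sbc(x, priors, cond, class_counts):
    """Return the class maximizing P(C) * prod_i P(A_i = x_i | C)."""
    best_class, best_score = None, -1.0
    for c, prior in priors.items():
        score = prior
        for i, value in enumerate(x):
            score *= cond[(i, value, c)] / class_counts[c]
        if score > best_score:
            best_class, best_score = c, score
    return best_class

# Conjunction concept: class 1 iff A0 = 1 and A1 = 1 (attributes are dependent
# given the class, yet the SBC classifies every example correctly here).
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
model = fit_sbc(X, y)
print([predict_sbc(x, *model) for x in X])  # expected output: [0, 0, 0, 1]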
Similar resources
Visualizing the Simple Bayesian Classifier
The simple Bayesian classifier (SBC), sometimes called Naive-Bayes, is built based on a conditional independence model of each attribute given the class. The model was previously shown to be surprisingly robust to obvious violations of this independence assumption, yielding accurate classification models even when there are clear conditional dependencies. The SBC can serve as an excellent tool fo...
Searching for Dependencies in Bayesian Classifiers
Naive Bayesian classifiers which make independence assumptions perform remarkably well on some data sets but poorly on others. We explore ways to improve the Bayesian classifier by searching for dependencies among attributes. We propose and evaluate two algorithms for detecting dependencies among attributes and show that the backward sequential elimination and joining algorithm provides the most ...
An Effective Bayesian Neural Network Classifier with a Comparison Study to Support Vector Machine
We propose a new Bayesian neural network classifier, different from that commonly used in several respects, including the likelihood function, prior specification, and network structure. Under regularity conditions, we show that the decision boundary determined by the new classifier will converge to the true one. We also propose a systematic implementation for the new classifier. In our implementat...
Improving Simple Bayes
The simple Bayesian classifier (SBC), sometimes called Naive-Bayes, is built based on a conditional independence model of each attribute given the class. The model was previously shown to be surprisingly robust to obvious violations of this independence assumption, yielding accurate classification models even when there are clear conditional dependencies. We examine different approaches for handli...
Publication year: 1996